{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Serving PyTorch Models In Production Natively With Amazon SageMaker"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Set Up Your Hosting Environment\n",
    "The focus of this lab is model serving. In that vein, we have taken care of the data preparation and model training. \n",
    "This lab exercise uses a [HuggingFace Transformer](https://huggingface.co/transformers/), which provides a general-purpose architecture for Natural Language Understanding (NLU). Specifically, we are presenting you with a [RoBERTa base](https://huggingface.co/roberta-base) transformer that was fine-tuned to perform sentiment analysis. The pre-trained checkpoint loads the additional head layers and will output ``positive``, ``neutral``, or ``negative`` sentiment for a given text. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import sagemaker\n",
    "from sagemaker import get_execution_role\n",
    "from sagemaker.utils import name_from_base\n",
    "from sagemaker.pytorch import PyTorchModel\n",
    "from sagemaker.predictor import RealTimePredictor, json_serializer, json_deserializer\n",
    "import boto3\n",
    "role = sagemaker.get_execution_role()\n",
    "\n",
    "#This is the fine-tuned roberta-base transformer hosted on an S3 bucket.\n",
    "model_artifact = 's3://torchserve-workshop/roberta-fine-tuned.tar.gz'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create Your Endpoint\n",
    "We will now create and deploy our model. To begin, we construct a new `PyTorchModel` object that points to the pre-trained model artifacts from the step above and to the inference code we wish to use. We then call the `deploy` method to launch the deployment container on our TorchServe-powered Amazon SageMaker endpoint."
   ]
  },
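  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The contents of `torchserve-predictor.py` are not shown in this notebook. As a rough sketch only: an entry point for the SageMaker PyTorch inference container typically defines some of the handler functions `model_fn`, `input_fn`, `predict_fn`, and `output_fn`. The cell below illustrates just the JSON handling, assuming requests carry a JSON object with a `text` field. The `LABELS` order, the `label_from_scores` helper, and the made-up scores are hypothetical; the real script may differ."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "\n",
    "# Hypothetical label order; the real mapping depends on how the model was fine-tuned.\n",
    "LABELS = ['negative', 'neutral', 'positive']\n",
    "\n",
    "def input_fn(request_body, request_content_type):\n",
    "    # Decode a JSON request body and return the value of its 'text' field.\n",
    "    if request_content_type != 'application/json':\n",
    "        raise ValueError(f'Unsupported content type: {request_content_type}')\n",
    "    return json.loads(request_body)['text']\n",
    "\n",
    "def label_from_scores(scores):\n",
    "    # Hypothetical helper: return the label with the highest score.\n",
    "    best = max(range(len(scores)), key=lambda i: scores[i])\n",
    "    return LABELS[best]\n",
    "\n",
    "def output_fn(prediction, response_content_type):\n",
    "    # Encode the predicted label as a JSON response body.\n",
    "    return json.dumps(prediction)\n",
    "\n",
    "# Local check with made-up scores; no endpoint or model is involved.\n",
    "text = input_fn(json.dumps({'text': 'example'}), 'application/json')\n",
    "print(text, '->', output_fn(label_from_scores([0.1, 0.2, 0.7]), 'application/json'))"
   ]
  },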
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Parameter image will be renamed to image_uri in SageMaker Python SDK v2.\n"
     ]
    }
   ],
   "source": [
    "class SentimentAnalysis(RealTimePredictor):\n",
    "    def __init__(self, endpoint_name, sagemaker_session):\n",
    "        super().__init__(endpoint_name, sagemaker_session=sagemaker_session, serializer=json_serializer, \n",
    "                         deserializer=json_deserializer, content_type='application/json')\n",
    "\n",
    "# Note: You can update the 'torchserve-predictor.py' file as needed for the model you want to use (e.g., BERT) \n",
    "model = PyTorchModel(model_data=model_artifact,\n",
    "                   name=name_from_base('roberta-model'),\n",
    "                   role=role, \n",
    "                   entry_point='torchserve-predictor.py',\n",
    "                   source_dir='source_dir',\n",
    "                   framework_version='1.6.0',\n",
    "                   predictor_cls=SentimentAnalysis)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "'create_image_uri' will be deprecated in favor of 'ImageURIProvider' class in SageMaker Python SDK v2.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "---------------!"
     ]
    }
   ],
   "source": [
    "# Generate a unique endpoint name. Deploying in the next cell takes around 7 minutes while your TorchServe-powered endpoint spins up on Amazon SageMaker \n",
    "endpoint_name = name_from_base('roberta-model') \n",
    "print(endpoint_name)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "predictor = model.deploy(initial_instance_count=1, instance_type='ml.m5.xlarge', endpoint_name=endpoint_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Perform Predictions With A TorchServe Backend Amazon SageMaker Endpoint\n",
    "Here, we pass sample strings of text to the endpoint to see the predicted sentiment. We give you one example for each sentiment class, but feel free to change the strings yourself! "
   ]
  },
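  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For reference, the same request can also be sent without the SDK predictor. The cell below is a minimal sketch that calls the endpoint through the boto3 `sagemaker-runtime` client. The helper name `invoke_raw` is hypothetical, and decoding the response as JSON assumes the inference script returns JSON, which matches the deserializer configured on the predictor above."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import boto3\n",
    "\n",
    "def invoke_raw(endpoint_name, payload):\n",
    "    # Send the payload as JSON through the low-level SageMaker runtime API.\n",
    "    runtime = boto3.client('sagemaker-runtime')\n",
    "    response = runtime.invoke_endpoint(\n",
    "        EndpointName=endpoint_name,\n",
    "        ContentType='application/json',\n",
    "        Accept='application/json',\n",
    "        Body=json.dumps(payload),\n",
    "    )\n",
    "    # The response body is a stream; read it and decode it as JSON (assumed format).\n",
    "    return json.loads(response['Body'].read().decode('utf-8'))\n",
    "\n",
    "# Usage (requires the endpoint deployed above to be in service):\n",
    "# invoke_raw(endpoint_name, {'text': 'TorchServe addresses an industry need.'})"
   ]
  },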
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'text': 'AWS is excited to announce that TorchServe is natively supported in Amazon SageMaker as the default model server for PyTorch inference'}\n"
     ]
    }
   ],
   "source": [
    "# Our endpoint's model should predict a positive sentiment from the text below\n",
    "test_data = {\"text\": \"AWS is excited to announce that TorchServe is natively supported in Amazon SageMaker as the default model server for PyTorch inference\"}\n",
    "print(test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [],
   "source": [
    "prediction = predictor.predict(test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Review text: {'text': 'AWS is excited to announce that TorchServe is natively supported in Amazon SageMaker as the default model server for PyTorch inference'}\n",
      "Sentiment  : positive\n"
     ]
    }
   ],
   "source": [
    "print(f'Review text: {test_data}')\n",
    "print(f'Sentiment  : {prediction}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'text': 'TorchServe addresses an industry need.'}\n"
     ]
    }
   ],
   "source": [
    "# Our endpoint's model should predict a neutral sentiment from the text below\n",
    "test_data = {\"text\": \"TorchServe addresses an industry need.\"}\n",
    "print(test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [],
   "source": [
    "prediction = predictor.predict(test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Review text: {'text': 'AWS is excited to announce that TorchServe is natively supported in Amazon SageMaker as the default model server for PyTorch inference'}\n",
      "Sentiment  : positive\n"
     ]
    }
   ],
   "source": [
    "print(f'Review text: {test_data}')\n",
    "print(f'Sentiment  : {prediction}')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{'text': 'I never liked having to convert my models just to deploy them in production!'}\n"
     ]
    }
   ],
   "source": [
    "# Our endpoint's model should predict a negative sentiment from the text below\n",
    "test_data = {\"text\": \"I never liked having to convert my models just to deploy them in production!\"}\n",
    "print(test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [],
   "source": [
    "prediction = predictor.predict(test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Review text: {'text': 'I never liked having to convert my models just to deploy them in production!'}\n",
      "Sentiment  : negative\n"
     ]
    }
   ],
   "source": [
    "print(f'Review text: {test_data}')\n",
    "print(f'Sentiment  : {prediction}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Environment Cleanup: Delete Endpoint, Endpoint Configuration, and Model\n",
    "To ensure that we are no longer billed for the endpoint or its associated resources, we use the steps below to tear the environment down. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "# predictor.delete_endpoint(delete_endpoint_config=True)\n",
    "# predictor.delete_model()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Congratulations!\n",
    "Please head back to the workshop to learn more about the next lab. "
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "conda_pytorch_p36",
   "language": "python",
   "name": "conda_pytorch_p36"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
